Why you should care - an exciting result!

Why you should care - uh oh!

Why you should care - serious trouble

Know and care about the analysis plan!

Have a plan for data and code sharing

May I recommend?

Formulate your question in advance

Statistical inference

Variability - Scenario 1

Variability - Scenario 2

Variability - Scenario 3

Confounding

Correlation is not causation*

Randomization and blocking

  • If you can (and want to), fix a variable
      • Example: the website always says Obama 2014 on it
  • If you don't fix a variable, stratify it
      • Example: if you are testing sign-up phrases and have two website colors, use both phrases equally on both colors
  • If you can't fix a variable, randomize it
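
The stratify-then-randomize idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the visitor IDs, the two colors ("blue", "red"), the two phrases, and the helper name `blocked_assignment` are all hypothetical and not taken from the slides. Within each website color (the block), visitors are shuffled at random and the phrases are dealt out in turn, so each phrase appears equally often on each color.

```python
import random
from collections import Counter


def blocked_assignment(units, block_of, treatments, seed=0):
    """Randomize treatments within blocks (hypothetical helper).

    Units are grouped by block, shuffled inside each block, and then
    given treatments in rotation, so every treatment is used (nearly)
    equally often within every block.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Group units by the variable we stratify on.
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of[unit], []).append(unit)

    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # the randomization step
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment


# Hypothetical data: 40 visitors, 20 per website color.
visitors = list(range(40))
color_of = {v: ("blue" if v < 20 else "red") for v in visitors}
phrases = ["Sign up", "Join us"]

assignment = blocked_assignment(visitors, color_of, phrases)

# Count how often each phrase was shown on each color.
counts = Counter((color_of[v], assignment[v]) for v in visitors)
for key in sorted(counts):
    print(key, counts[key])
```

With an even number of visitors per color, each (color, phrase) pair gets the same count, so website color cannot be mistaken for an effect of the phrase. Which visitor sees which phrase is still left to chance.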

Why does randomization help?

Prediction

Prediction versus inference

Prediction key quantities

Beware data dredging

Summary

  • Good experiments
      • Have replication
      • Measure variability
      • Generalize to the problem you care about
      • Are transparent
  • Prediction is not inference
      • Both can be important
  • Beware data dredging